Author: Leonardo Espin
Date: 1/10/2019
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.image as mpimg #to work with raster images
%matplotlib inline
XDF contains image data that has been flattened to vectors: 5000 training examples of 20x20 pixel images (each a vector of 400 elements). yDF contains the labels for each image; the images corresponding to zero have been labeled as 10. The weights of the hidden layer are in theta1DF, and the weights of the output layer are in theta2DF. This is a subset of the MNIST handwritten digit dataset.
XDF=pd.read_csv('ex3data1-X.csv',header=None)
yDF=pd.read_csv('ex3data1-y.csv',header=None)
theta1DF=pd.read_csv('ex3weights-T1.csv',header=None)
theta2DF=pd.read_csv('ex3weights-T2.csv',header=None)
print(XDF.shape)
XDF.head()
print(theta1DF.shape)
print(theta2DF.shape)
theta2DF.head()
The images correspond to hand-drawn digits (0 to 9), and they can be shown with the imshow command:
tmp=XDF.iloc[0,:].values.reshape(20,20)
plt.imshow(tmp);
Below I show a mosaic of 100 images selected at random from the training samples (notice that the bitmap arrays have to be transposed to be shown correctly):
import random
#select 100 images (rows) at random; randint is inclusive, so valid row indices are 0-4999
selection=[random.randint(0,4999) for x in range(100)]
image=np.zeros((20*10, 20*10)) #for constructing a mosaic of 10x10 images
coords=[(x,y) for x in range(1,11) for y in range(1,11)]
for k,tup in enumerate(coords):
    indYa=0+20*(tup[0]-1)
    indYb=19+20*(tup[0]-1)
    indXa=0+20*(tup[1]-1)
    indXb=19+20*(tup[1]-1)
    image[indYa:indYb+1,indXa:indXb+1]=XDF.iloc[selection[k],:].values.reshape(20,20).transpose()
plt.figure(figsize=(8,8))
plt.imshow(image);
The structure of the trained neural network is shown below: an input layer with 400 units (plus a bias unit), one hidden layer with 25 units (plus a bias unit), and an output layer with 10 units.
Below I apply the neural network to the training set in XDF. Note that columns or rows of ones have to be added to the flattened images to account for the bias units.
#add a column (axis=1) of ones (3rd argument) to the image data (1st argument) at
#the beginning of the matrix (2nd argument)
X=np.insert(XDF.values, 0, 1, axis=1)
X[0:5,0:15]
Multiplying each image by the hidden-layer weights to obtain the $z^{(2)}$ values:
Z2=np.matmul(theta1DF.values,X.transpose())
print(Z2.shape)
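For reference, the full feedforward pass computed in the next few cells is, in matrix notation,

$$z^{(2)}=\Theta^{(1)}a^{(1)},\qquad a^{(2)}=g\left(z^{(2)}\right),\qquad z^{(3)}=\Theta^{(2)}a^{(2)},\qquad a^{(3)}=g\left(z^{(3)}\right),$$

where $g$ is the sigmoid function $g(z)=1/(1+e^{-z})$, $a^{(1)}$ are the images with the bias unit prepended, and $\Theta^{(1)}$, $\Theta^{(2)}$ are the weight matrices in theta1DF and theta2DF.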
The values $a^{(2)}_i$, $i=1,\dots,25$ are obtained by applying the sigmoid function (notice that the function is vectorized and applied simultaneously to the whole Z2 matrix):
def g(z):
    return 1/(1+np.exp(-z)) #np.exp broadcasts, so g is already vectorized and works elementwise on arrays
A2=g(Z2)
print(A2.shape)
A2[0:3,0:5]
A row of ones is added to the matrix A2 to account for the bias unit:
A2=np.insert(A2, 0, 1, axis=0)
A2[0:4,0:5]
Below are the calculations for the output layer, which has 10 nodes corresponding to the 10 categories of the hand-written digits:
Z3=np.matmul(theta2DF.values,A2)
A3=g(Z3)
print(A3.shape)
print('classification results of first 4 images (choose max value per column):')
A3[0:10,0:4]
classification=np.argmax(A3,axis=0)
classification
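Note that the argmax returns a row index from 0 to 9, while the labels in yDF run from 1 to 10, with 10 encoding the digit zero. A minimal sketch of mapping the indices back to actual digits (predicted_digits is an illustrative name, not used elsewhere):
#map argmax row indices (0-9) to labels (1-10); label 10 encodes the digit 0
predicted_labels=classification+1
predicted_digits=np.where(predicted_labels==10,0,predicted_labels)
predicted_digits[0:10]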
Below I show a few classification results chosen at random:
import time
for _ in range(4):
    k=random.choice(selection) #pick one of the 100 previously selected rows
    print('learned value = {}'.format(classification[k]+1))
    tmp=XDF.iloc[k,:].values.reshape(20,20).transpose()
    plt.imshow(tmp,animated=True)
    plt.show()
    time.sleep(1.5)
The overall classification accuracy is:
accuracy=(100*sum(classification.reshape(yDF.values.shape) == yDF.values-1)
          /len(classification))[0]
print('classification accuracy: {}%'.format(accuracy))
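For reference, the feedforward steps above can be collected into a single helper; this is just a sketch (predict_nn is a hypothetical name) that reproduces the same computations:
#sketch: the full feedforward pass as one function; returns the argmax
#row index (0-9) for each image
def predict_nn(theta1,theta2,X):
    A1=np.insert(X, 0, 1, axis=1) #add the bias column to the images
    A2=g(np.matmul(theta1,A1.transpose())) #hidden-layer activations
    A2=np.insert(A2, 0, 1, axis=0) #add the bias row
    A3=g(np.matmul(theta2,A2)) #output-layer activations
    return np.argmax(A3,axis=0)
#this should reproduce the classification array computed above
(predict_nn(theta1DF.values,theta2DF.values,XDF.values)==classification).all()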
from sklearn.neural_network import MLPClassifier
#adam is the default solver and works well on relatively large datasets;
#for small sets like this one 'lbfgs' can converge faster ('sgd' is another option)
clf = MLPClassifier(solver='adam', #adam is the default
                    alpha=1e-3, #reg. parameter lambda =1/(2*500)
                    hidden_layer_sizes=(25,), #1 hidden layer with 25 units
                    activation='logistic', #the logistic sigmoid function, could change to relu
                    #max_iter=400,
                    validation_fraction=0.2)
clf.fit(XDF.values, yDF.values.flatten()) #flatten reshapes the 5000x1 column matrix to a 1D array,
#otherwise sklearn complains
The learned coefficients are below. Notice that we obtain a very high classification accuracy, but it is measured on the same 5000 images used for training (about 8% of the entire MNIST dataset), so the model is most likely overfitting.
T1=clf.coefs_[0]
T2=clf.coefs_[1]
print(T1.shape)
print(T2.shape)
print('theta_0 coefficients:')
print(clf.intercepts_[0].shape)
print(clf.intercepts_[1].shape)
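Note that sklearn stores these weights transposed relative to the theta matrices used earlier, with the bias terms kept separately in intercepts_. A sketch of reassembling them into the earlier convention (Theta1 and Theta2 are illustrative names):
#stack the intercepts as a first column to recover matrices with the same
#layout as theta1DF (25x401) and theta2DF (10x26)
Theta1=np.hstack([clf.intercepts_[0].reshape(-1,1),T1.transpose()])
Theta2=np.hstack([clf.intercepts_[1].reshape(-1,1),T2.transpose()])
print(Theta1.shape)
print(Theta2.shape)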
accuracy=(100*sum(clf.predict(XDF.values).reshape(yDF.values.shape) == yDF.values)
          /len(classification))[0]
print('classification accuracy: {}%'.format(accuracy))
#or more easily
clf.score(XDF.values, yDF.values)
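Since this score is computed on the training data, a held-out split gives a more honest estimate of generalization. A minimal sketch using sklearn's train_test_split (the 0.2 test fraction and random_state are arbitrary choices):
from sklearn.model_selection import train_test_split
#hold out 20% of the images and score on them instead of on the training set
X_train,X_test,y_train,y_test=train_test_split(XDF.values,yDF.values.flatten(),
                                               test_size=0.2,random_state=0)
clf2=MLPClassifier(solver='adam',alpha=1e-3,hidden_layer_sizes=(25,),
                   activation='logistic',validation_fraction=0.2)
clf2.fit(X_train,y_train)
clf2.score(X_test,y_test)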
A Keras DNN trained on the entire dataset can be seen here: Keras and the MNIST dataset